Dataset statistics
| Number of variables | 36 |
|---|---|
| Number of observations | 26707 |
| Missing cells | 60762 |
| Missing cells (%) | 6.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.3 MiB |
| Average record size in memory | 288.0 B |
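The derived figures in this table can be cross-checked from the raw counts. The snippet below is only a sanity check using numbers copied from the table; it assumes that 1 MiB means 2**20 bytes and that the missing-cell percentage is taken over all cells (observations times variables).

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 36
n_observations = 26707
missing_cells = 60762
avg_record_bytes = 288.0

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, in percent.
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size (assumes 1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")       # 6.3%
print(f"Total size in memory: {total_mib:.1f} MiB")   # 7.3 MiB
```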
Variable types
| CAT | 16 |
|---|---|
| BOOL | 13 |
| NUM | 7 |
Reproduction
| Analysis started | 2020-09-14 23:44:13.646257 |
|---|---|
| Analysis finished | 2020-09-14 23:44:39.101994 |
| Duration | 25.46 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
doctor_recc_h1n1 has 2160 (8.1%) missing values | Missing |
doctor_recc_seasonal has 2160 (8.1%) missing values | Missing |
chronic_med_condition has 971 (3.6%) missing values | Missing |
child_under_6_months has 820 (3.1%) missing values | Missing |
health_worker has 804 (3.0%) missing values | Missing |
health_insurance has 12274 (46.0%) missing values | Missing |
opinion_h1n1_vacc_effective has 391 (1.5%) missing values | Missing |
opinion_h1n1_risk has 388 (1.5%) missing values | Missing |
opinion_h1n1_sick_from_vacc has 395 (1.5%) missing values | Missing |
opinion_seas_vacc_effective has 462 (1.7%) missing values | Missing |
opinion_seas_risk has 514 (1.9%) missing values | Missing |
opinion_seas_sick_from_vacc has 537 (2.0%) missing values | Missing |
education has 1407 (5.3%) missing values | Missing |
income_poverty has 4423 (16.6%) missing values | Missing |
marital_status has 1408 (5.3%) missing values | Missing |
rent_or_own has 2042 (7.6%) missing values | Missing |
employment_status has 1463 (5.5%) missing values | Missing |
employment_industry has 13330 (49.9%) missing values | Missing |
employment_occupation has 13470 (50.4%) missing values | Missing |
respondent_id has unique values | Unique |
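The missing-value warnings above can be reproduced from per-column missing counts. The sketch below uses the counts reported in this document; the 1% cutoff and the helper name `flag_missing` are assumptions chosen only because they separate the flagged columns from the unflagged ones here, not the tool's documented setting.

```python
N_OBSERVATIONS = 26707

# Missing counts copied from the warnings and the per-variable tables in this report.
missing_counts = {
    "h1n1_concern": 92, "h1n1_knowledge": 116,
    "behavioral_antiviral_meds": 71, "behavioral_avoidance": 208,
    "behavioral_face_mask": 19, "behavioral_wash_hands": 42,
    "behavioral_large_gatherings": 87, "behavioral_outside_home": 82,
    "behavioral_touch_face": 128,
    "doctor_recc_h1n1": 2160, "doctor_recc_seasonal": 2160,
    "chronic_med_condition": 971, "child_under_6_months": 820,
    "health_worker": 804, "health_insurance": 12274,
    "opinion_h1n1_vacc_effective": 391, "opinion_h1n1_risk": 388,
    "opinion_h1n1_sick_from_vacc": 395, "opinion_seas_vacc_effective": 462,
    "opinion_seas_risk": 514, "opinion_seas_sick_from_vacc": 537,
    "education": 1407, "income_poverty": 4423, "marital_status": 1408,
    "rent_or_own": 2042, "employment_status": 1463,
    "household_adults": 249, "household_children": 249,
    "employment_industry": 13330, "employment_occupation": 13470,
}


def flag_missing(counts, n_rows, threshold=0.01):
    """Return the columns whose missing fraction exceeds `threshold`.

    The 1% default is a hypothetical cutoff, not a documented tool setting.
    """
    return [name for name, n in counts.items() if n / n_rows > threshold]


flagged = flag_missing(missing_counts, N_OBSERVATIONS)
total_missing = sum(missing_counts.values())

print(len(flagged), "columns flagged")
print(total_missing, "missing cells in total")
```

With these counts the per-column totals add up to the 60762 missing cells reported in the dataset statistics.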
respondent_id
Numeric
| Distinct count | 26707 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13353.0 |
| Minimum | 0 |
| Maximum | 26706 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1335.3 |
| Q1 | 6676.5 |
| median | 13353 |
| Q3 | 20029.5 |
| 95-th percentile | 25370.7 |
| Maximum | 26706 |
| Range | 26706 |
| Interquartile range (IQR) | 13353 |
Descriptive statistics
| Standard deviation | 7709.791156 |
|---|---|
| Coefficient of variation (CV) | 0.5773826972 |
| Kurtosis | -1.2 |
| Mean | 13353 |
| Median Absolute Deviation (MAD) | 6677 |
| Skewness | 0 |
| Sum | 356618571 |
| Variance | 59440879.67 |
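Because respondent_id takes every integer from 0 to 26706 exactly once, the quantile and descriptive statistics above can be recomputed with the Python standard library alone. This is a sketch under two assumptions that happen to match the reported figures: percentiles use linear interpolation (the "inclusive" method) and the variance uses the sample (n - 1) denominator.

```python
import statistics

# respondent_id: every integer from 0 to 26706, each appearing once.
values = list(range(26707))

# Percentiles with linear interpolation between neighbouring values.
twentieths = statistics.quantiles(values, n=20, method="inclusive")
quartiles = statistics.quantiles(values, n=4, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
q1, median, q3 = quartiles
iqr = q3 - q1

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(v - median) for v in values)

print(p5, q1, median, q3, p95, iqr)
print(mean, round(variance, 2), round(std, 6), round(cv, 10), mad, sum(values))
```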
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2047 | 1 | < 0.1% |
| 7657 | 1 | < 0.1% |
| 3371 | 1 | < 0.1% |
| 13612 | 1 | < 0.1% |
| 15661 | 1 | < 0.1% |
| 9518 | 1 | < 0.1% |
| 11567 | 1 | < 0.1% |
| 21824 | 1 | < 0.1% |
| 23873 | 1 | < 0.1% |
| 17730 | 1 | < 0.1% |
| Other values (26697) | 26697 | > 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 26706 | 1 | < 0.1% |
| 26705 | 1 | < 0.1% |
| 26704 | 1 | < 0.1% |
| 26703 | 1 | < 0.1% |
| 26702 | 1 | < 0.1% |
h1n1_concern
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 92 |
| Missing (%) | 0.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 10575 | 39.6% |
| 1 | 8153 | 30.5% |
| 3 | 4591 | 17.2% |
| 0 | 3296 | 12.3% |
| (Missing) | 92 | 0.3% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
h1n1_knowledge
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 116 |
| Missing (%) | 0.4% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 14598 | 54.7% |
| 2 | 9487 | 35.5% |
| 0 | 2506 | 9.4% |
| (Missing) | 116 | 0.4% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
behavioral_antiviral_meds
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 71 |
| Missing (%) | 0.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 25335 | 94.9% |
| 1 | 1301 | 4.9% |
| (Missing) | 71 | 0.3% |
behavioral_avoidance
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 208 |
| Missing (%) | 0.8% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 19228 | 72.0% |
| 0 | 7271 | 27.2% |
| (Missing) | 208 | 0.8% |
behavioral_face_mask
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 19 |
| Missing (%) | 0.1% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 24847 | 93.0% |
| 1 | 1841 | 6.9% |
| (Missing) | 19 | 0.1% |
behavioral_wash_hands
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 42 |
| Missing (%) | 0.2% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 22015 | 82.4% |
| 0 | 4650 | 17.4% |
| (Missing) | 42 | 0.2% |
behavioral_large_gatherings
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 87 |
| Missing (%) | 0.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 17073 | 63.9% |
| 1 | 9547 | 35.7% |
| (Missing) | 87 | 0.3% |
behavioral_outside_home
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 82 |
| Missing (%) | 0.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 17644 | 66.1% |
| 1 | 8981 | 33.6% |
| (Missing) | 82 | 0.3% |
behavioral_touch_face
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 128 |
| Missing (%) | 0.5% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 18001 | 67.4% |
| 0 | 8578 | 32.1% |
| (Missing) | 128 | 0.5% |
doctor_recc_h1n1
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 2160 |
| Missing (%) | 8.1% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 19139 | 71.7% |
| 1 | 5408 | 20.2% |
| (Missing) | 2160 | 8.1% |
doctor_recc_seasonal
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 2160 |
| Missing (%) | 8.1% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 16453 | 61.6% |
| 1 | 8094 | 30.3% |
| (Missing) | 2160 | 8.1% |
chronic_med_condition
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 971 |
| Missing (%) | 3.6% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18446 | 69.1% |
| 1 | 7290 | 27.3% |
| (Missing) | 971 | 3.6% |
child_under_6_months
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 820 |
| Missing (%) | 3.1% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 23749 | 88.9% |
| 1 | 2138 | 8.0% |
| (Missing) | 820 | 3.1% |
health_worker
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 804 |
| Missing (%) | 3.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 23004 | 86.1% |
| 1 | 2899 | 10.9% |
| (Missing) | 804 | 3.0% |
health_insurance
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 12274 |
| Missing (%) | 46.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 12697 | 47.5% |
| 0 | 1736 | 6.5% |
| (Missing) | 12274 | 46.0% |
opinion_h1n1_vacc_effective
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 391 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.85062319501444 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.007435774 |
|---|---|
| Coefficient of variation (CV) | 0.2616292801 |
| Kurtosis | 0.5155901708 |
| Mean | 3.850623195 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.9027057052 |
| Sum | 101333 |
| Variance | 1.014926839 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 11683 | 43.7% |
| 5 | 7166 | 26.8% |
| 3 | 4723 | 17.7% |
| 2 | 1858 | 7.0% |
| 1 | 886 | 3.3% |
| (Missing) | 391 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 886 | 3.3% |
| 2 | 1858 | 7.0% |
| 3 | 4723 | 17.7% |
| 4 | 11683 | 43.7% |
| 5 | 7166 | 26.8% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 7166 | 26.8% |
| 4 | 11683 | 43.7% |
| 3 | 4723 | 17.7% |
| 2 | 1858 | 7.0% |
| 1 | 886 | 3.3% |
opinion_h1n1_risk
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 388 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.3425662069227555 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.285539287 |
|---|---|
| Coefficient of variation (CV) | 0.5487739398 |
| Kurtosis | -0.8466274374 |
| Mean | 2.342566207 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.6729565579 |
| Sum | 61654 |
| Variance | 1.652611258 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 9919 | 37.1% |
| 1 | 8139 | 30.5% |
| 4 | 5394 | 20.2% |
| 5 | 1750 | 6.6% |
| 3 | 1117 | 4.2% |
| (Missing) | 388 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 8139 | 30.5% |
| 2 | 9919 | 37.1% |
| 3 | 1117 | 4.2% |
| 4 | 5394 | 20.2% |
| 5 | 1750 | 6.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1750 | 6.6% |
| 4 | 5394 | 20.2% |
| 3 | 1117 | 4.2% |
| 2 | 9919 | 37.1% |
| 1 | 8139 | 30.5% |
opinion_h1n1_sick_from_vacc
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 395 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.357669504408635 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.362765917 |
|---|---|
| Coefficient of variation (CV) | 0.578013973 |
| Kurtosis | -1.01723403 |
| Mean | 2.357669504 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.6512805475 |
| Sum | 62035 |
| Variance | 1.857130945 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 9129 | 34.2% |
| 1 | 8998 | 33.7% |
| 4 | 5850 | 21.9% |
| 5 | 2187 | 8.2% |
| 3 | 148 | 0.6% |
| (Missing) | 395 | 1.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 8998 | 33.7% |
| 2 | 9129 | 34.2% |
| 3 | 148 | 0.6% |
| 4 | 5850 | 21.9% |
| 5 | 2187 | 8.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 2187 | 8.2% |
| 4 | 5850 | 21.9% |
| 3 | 148 | 0.6% |
| 2 | 9129 | 34.2% |
| 1 | 8998 | 33.7% |
opinion_seas_vacc_effective
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 462 |
| Missing (%) | 1.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.025985902076586 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 4 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.086564904 |
|---|---|
| Coefficient of variation (CV) | 0.2698879057 |
| Kurtosis | 1.097349658 |
| Mean | 4.025985902 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -1.315176563 |
| Sum | 105662 |
| Variance | 1.18062329 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 11629 | 43.5% |
| 5 | 9973 | 37.3% |
| 2 | 2206 | 8.3% |
| 1 | 1221 | 4.6% |
| 3 | 1216 | 4.6% |
| (Missing) | 462 | 1.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1221 | 4.6% |
| 2 | 2206 | 8.3% |
| 3 | 1216 | 4.6% |
| 4 | 11629 | 43.5% |
| 5 | 9973 | 37.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 9973 | 37.3% |
| 4 | 11629 | 43.5% |
| 3 | 1216 | 4.6% |
| 2 | 2206 | 8.3% |
| 1 | 1221 | 4.6% |
opinion_seas_risk
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 514 |
| Missing (%) | 1.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.719161608063223 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.385055181 |
|---|---|
| Coefficient of variation (CV) | 0.5093684676 |
| Kurtosis | -1.390171224 |
| Mean | 2.719161608 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.2509228606 |
| Sum | 71223 |
| Variance | 1.918377855 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 8954 | 33.5% |
| 4 | 7630 | 28.6% |
| 1 | 5974 | 22.4% |
| 5 | 2958 | 11.1% |
| 3 | 677 | 2.5% |
| (Missing) | 514 | 1.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 5974 | 22.4% |
| 2 | 8954 | 33.5% |
| 3 | 677 | 2.5% |
| 4 | 7630 | 28.6% |
| 5 | 2958 | 11.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 2958 | 11.1% |
| 4 | 7630 | 28.6% |
| 3 | 677 | 2.5% |
| 2 | 8954 | 33.5% |
| 1 | 5974 | 22.4% |
opinion_seas_sick_from_vacc
Numeric
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 537 |
| Missing (%) | 2.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.1181123423767674 |
| Minimum | 1.0 |
| Maximum | 5.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 208.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.332949925 |
|---|---|
| Coefficient of variation (CV) | 0.6293103054 |
| Kurtosis | -0.600915934 |
| Mean | 2.118112342 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.9195873019 |
| Sum | 55431 |
| Variance | 1.776755503 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 11870 | 44.4% |
| 2 | 7633 | 28.6% |
| 4 | 4852 | 18.2% |
| 5 | 1721 | 6.4% |
| 3 | 94 | 0.4% |
| (Missing) | 537 | 2.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 11870 | 44.4% |
| 2 | 7633 | 28.6% |
| 3 | 94 | 0.4% |
| 4 | 4852 | 18.2% |
| 5 | 1721 | 6.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1721 | 6.4% |
| 4 | 4852 | 18.2% |
| 3 | 94 | 0.4% |
| 2 | 7633 | 28.6% |
| 1 | 11870 | 44.4% |
age_group
Categorical
| Distinct count | 5 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 65+ Years | 6843 | 25.6% |
| 55 - 64 Years | 5563 | 20.8% |
| 45 - 54 Years | 5238 | 19.6% |
| 18 - 34 Years | 5215 | 19.5% |
| 35 - 44 Years | 3848 | 14.4% |
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 11.97510016 |
| Min length | 9 |
education
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 1407 |
| Missing (%) | 5.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| College Graduate | 10097 | 37.8% |
| Some College | 7043 | 26.4% |
| 12 Years | 5797 | 21.7% |
| < 12 Years | 2363 | 8.8% |
| (Missing) | 1407 | 5.3% |
Length
| Max length | 16 |
|---|---|
| Median length | 12 |
| Mean length | 11.9929232 |
| Min length | 3 |
race
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| White | 21222 | 79.5% |
| Black | 2118 | 7.9% |
| Hispanic | 1755 | 6.6% |
| Other or Multiple | 1612 | 6.0% |
Length
| Max length | 17 |
|---|---|
| Median length | 5 |
| Mean length | 5.921443816 |
| Min length | 5 |
sex
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Female | 15858 | 59.4% |
| Male | 10849 | 40.6% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.187553825 |
| Min length | 4 |
income_poverty
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 4423 |
| Missing (%) | 16.6% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| <= $75,000, Above Poverty | 12777 | 47.8% |
| > $75,000 | 6810 | 25.5% |
| Below Poverty | 2697 | 10.1% |
| (Missing) | 4423 | 16.6% |
Length
| Max length | 25 |
|---|---|
| Median length | 13 |
| Mean length | 16.06488935 |
| Min length | 3 |
marital_status
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 1408 |
| Missing (%) | 5.3% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Married | 13555 | 50.8% |
| Not Married | 11744 | 44.0% |
| (Missing) | 1408 | 5.3% |
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 8.548058561 |
| Min length | 3 |
rent_or_own
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 2042 |
| Missing (%) | 7.6% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Own | 18736 | 70.2% |
| Rent | 5929 | 22.2% |
| (Missing) | 2042 | 7.6% |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.222001722 |
| Min length | 3 |
employment_status
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 1463 |
| Missing (%) | 5.5% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Employed | 13560 | 50.8% |
| Not in Labor Force | 10231 | 38.3% |
| Unemployed | 1453 | 5.4% |
| (Missing) | 1463 | 5.5% |
Length
| Max length | 18 |
|---|---|
| Median length | 8 |
| Mean length | 11.66574306 |
| Min length | 3 |
hhs_geo_region
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| lzgpxyit | 4297 | 16.1% |
| fpwskwrf | 3265 | 12.2% |
| qufhixun | 3102 | 11.6% |
| oxchjgsf | 2859 | 10.7% |
| kbazzjca | 2858 | 10.7% |
| bhuqouqj | 2846 | 10.7% |
| mlyzmhmf | 2243 | 8.4% |
| lrircsnp | 2078 | 7.8% |
| atmpeygn | 2033 | 7.6% |
| dqpwygqj | 1126 | 4.2% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
census_msa
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| MSA, Not Principle City | 11645 | 43.6% |
| MSA, Principle City | 7864 | 29.4% |
| Non-MSA | 7198 | 27.0% |
Length
| Max length | 24 |
|---|---|
| Median length | 19 |
| Mean length | 17.94593178 |
| Min length | 7 |
household_adults
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 249 |
| Missing (%) | 0.9% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 14474 | 54.2% |
| 0 | 8056 | 30.2% |
| 2 | 2803 | 10.5% |
| 3 | 1125 | 4.2% |
| (Missing) | 249 | 0.9% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
household_children
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 249 |
| Missing (%) | 0.9% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18672 | 69.9% |
| 1 | 3175 | 11.9% |
| 2 | 2864 | 10.7% |
| 3 | 1747 | 6.5% |
| (Missing) | 249 | 0.9% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
employment_industry
Categorical
| Distinct count | 21 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 13330 |
| Missing (%) | 49.9% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| fcxhlnwr | 2468 | 9.2% |
| wxleyezf | 1804 | 6.8% |
| ldnlellj | 1231 | 4.6% |
| pxcmvdjn | 1037 | 3.9% |
| atmlpfrs | 926 | 3.5% |
| arjwrbjb | 871 | 3.3% |
| xicduogh | 851 | 3.2% |
| mfikgejo | 614 | 2.3% |
| vjjrobsf | 527 | 2.0% |
| rucpziij | 523 | 2.0% |
| Other values (11) | 2525 | 9.5% |
| (Missing) | 13330 | 49.9% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 5.504399596 |
| Min length | 3 |
employment_occupation
Categorical
| Distinct count | 23 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 13470 |
| Missing (%) | 50.4% |
| Memory size | 208.6 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| xtkaffoo | 1778 | 6.7% |
| mxkfnird | 1509 | 5.7% |
| emcorrxb | 1270 | 4.8% |
| cmhcxjea | 1247 | 4.7% |
| xgwztkwe | 1082 | 4.1% |
| hfxkjkmi | 766 | 2.9% |
| qxajmpny | 548 | 2.1% |
| xqwwgdyp | 485 | 1.8% |
| kldqjyjy | 469 | 1.8% |
| uqqtjvyb | 452 | 1.7% |
| Other values (13) | 3631 | 13.6% |
| (Missing) | 13470 | 50.4% |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 5.478189239 |
| Min length | 3 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the slope does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
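The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report tool's implementation; the function name `pearson_r` and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/(n - 1) factors of the covariance and the two standard deviations
    # cancel, so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]
steep_up = [2 * a + 3 for a in x]         # shifted and rescaled copy of x
gentle_down = [10 - 0.5 * a for a in x]   # decreasing linear function of x

print(pearson_r(x, steep_up))     # 1.0
print(pearson_r(x, gentle_down))  # -1.0
```

Both linear transforms give a coefficient of magnitude 1, which illustrates the invariance under changes in location and scale.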
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
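A minimal sketch of that recipe: replace each variable by its ranks, then apply the Pearson formula to the ranks. Assigning tied values the average of their ranks is an assumption made here (a common convention); the helper names and sample data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]  # monotonic but nonlinear in x

print(spearman_rho(x, y))  # 1.0
print(pearson_r(x, y))     # below 1, because the relationship is not linear
```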
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
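The pair-counting definition above corresponds to the simplest variant, often called tau-a. The sketch below implements exactly that definition; statistical libraries commonly apply a tie correction (tau-b) instead, so results can differ when the data contain ties. The function name and sample data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(zip(x, y), 2))
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables move in the same direction
            concordant += 1
        elif direction < 0:    # the variables move in opposite directions
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

tau = kendall_tau_a(x, y)
print(tau)  # (5 - 1) / 6
```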
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for Phik is available online.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | respondent_id | h1n1_concern | h1n1_knowledge | behavioral_antiviral_meds | behavioral_avoidance | behavioral_face_mask | behavioral_wash_hands | behavioral_large_gatherings | behavioral_outside_home | behavioral_touch_face | doctor_recc_h1n1 | doctor_recc_seasonal | chronic_med_condition | child_under_6_months | health_worker | health_insurance | opinion_h1n1_vacc_effective | opinion_h1n1_risk | opinion_h1n1_sick_from_vacc | opinion_seas_vacc_effective | opinion_seas_risk | opinion_seas_sick_from_vacc | age_group | education | race | sex | income_poverty | marital_status | rent_or_own | employment_status | hhs_geo_region | census_msa | household_adults | household_children | employment_industry | employment_occupation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 3.0 | 1.0 | 2.0 | 2.0 | 1.0 | 2.0 | 55 - 64 Years | < 12 Years | White | Female | Below Poverty | Not Married | Own | Not in Labor Force | oxchjgsf | Non-MSA | 0.0 | 0.0 | NaN | NaN |
| 1 | 1 | 3.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 | 4.0 | 4.0 | 4.0 | 2.0 | 4.0 | 35 - 44 Years | 12 Years | White | Male | Below Poverty | Not Married | Rent | Employed | bhuqouqj | MSA, Not Principle City | 0.0 | 0.0 | pxcmvdjn | xgwztkwe |
| 2 | 2 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | NaN | 1.0 | 0.0 | 0.0 | NaN | 3.0 | 1.0 | 1.0 | 4.0 | 1.0 | 2.0 | 18 - 34 Years | College Graduate | White | Male | <= $75,000, Above Poverty | Not Married | Own | Employed | qufhixun | MSA, Not Principle City | 2.0 | 0.0 | rucpziij | xtkaffoo |
| 3 | 3 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | NaN | 3.0 | 3.0 | 5.0 | 5.0 | 4.0 | 1.0 | 65+ Years | 12 Years | White | Female | Below Poverty | Not Married | Rent | Not in Labor Force | lrircsnp | MSA, Principle City | 0.0 | 0.0 | NaN | NaN |
| 4 | 4 | 2.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 3.0 | 3.0 | 2.0 | 3.0 | 1.0 | 4.0 | 45 - 54 Years | Some College | White | Female | <= $75,000, Above Poverty | Married | Own | Employed | qufhixun | MSA, Not Principle City | 1.0 | 0.0 | wxleyezf | emcorrxb |
| 5 | 5 | 3.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | NaN | 5.0 | 2.0 | 1.0 | 5.0 | 4.0 | 4.0 | 65+ Years | 12 Years | White | Male | <= $75,000, Above Poverty | Married | Own | Employed | atmpeygn | MSA, Principle City | 2.0 | 3.0 | saaquncn | vlluhbov |
| 6 | 6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 4.0 | 1.0 | 1.0 | 4.0 | 2.0 | 1.0 | 55 - 64 Years | < 12 Years | White | Male | <= $75,000, Above Poverty | Not Married | Own | Employed | qufhixun | MSA, Not Principle City | 0.0 | 0.0 | xicduogh | xtkaffoo |
| 7 | 7 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 5.0 | 2.0 | 1.0 | 4.0 | 2.0 | 1.0 | 45 - 54 Years | Some College | White | Female | <= $75,000, Above Poverty | Married | Own | Employed | bhuqouqj | Non-MSA | 2.0 | 0.0 | pxcmvdjn | xqwwgdyp |
| 8 | 8 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 4.0 | 1.0 | 1.0 | 4.0 | 2.0 | 1.0 | 45 - 54 Years | College Graduate | White | Male | > $75,000 | Married | Own | Employed | bhuqouqj | MSA, Not Principle City | 1.0 | 0.0 | xicduogh | ccgxvspp |
| 9 | 9 | 2.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 4.0 | 2.0 | 2.0 | 4.0 | 2.0 | 2.0 | 55 - 64 Years | 12 Years | White | Male | <= $75,000, Above Poverty | Not Married | Own | Not in Labor Force | qufhixun | MSA, Not Principle City | 0.0 | 0.0 | NaN | NaN |
Last rows
| | respondent_id | h1n1_concern | h1n1_knowledge | behavioral_antiviral_meds | behavioral_avoidance | behavioral_face_mask | behavioral_wash_hands | behavioral_large_gatherings | behavioral_outside_home | behavioral_touch_face | doctor_recc_h1n1 | doctor_recc_seasonal | chronic_med_condition | child_under_6_months | health_worker | health_insurance | opinion_h1n1_vacc_effective | opinion_h1n1_risk | opinion_h1n1_sick_from_vacc | opinion_seas_vacc_effective | opinion_seas_risk | opinion_seas_sick_from_vacc | age_group | education | race | sex | income_poverty | marital_status | rent_or_own | employment_status | hhs_geo_region | census_msa | household_adults | household_children | employment_industry | employment_occupation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26697 | 26697 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 4.0 | 2.0 | 2.0 | 4.0 | 2.0 | 2.0 | 65+ Years | College Graduate | White | Male | > $75,000 | Married | Own | Not in Labor Force | kbazzjca | MSA, Principle City | 1.0 | 0.0 | NaN | NaN |
| 26698 | 26698 | 2.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 5.0 | 4.0 | 2.0 | 4.0 | 4.0 | 2.0 | 35 - 44 Years | College Graduate | White | Female | > $75,000 | Married | Own | Employed | atmpeygn | MSA, Not Principle City | 1.0 | 1.0 | dotnnunm | mxkfnird |
| 26699 | 26699 | 2.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 | 1.0 | 5.0 | 5.0 | 1.0 | 4.0 | 45 - 54 Years | Some College | White | Female | <= $75,000, Above Poverty | Married | Own | Employed | qufhixun | MSA, Not Principle City | 1.0 | 0.0 | pxcmvdjn | xgwztkwe |
| 26700 | 26700 | 3.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 4.0 | 2.0 | 5.0 | 5.0 | 4.0 | 5.0 | 55 - 64 Years | 12 Years | White | Female | > $75,000 | Married | Own | Not in Labor Force | lzgpxyit | MSA, Principle City | 1.0 | 0.0 | NaN | NaN |
| 26701 | 26701 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 4.0 | 2.0 | 4.0 | 4.0 | 2.0 | 4.0 | 18 - 34 Years | College Graduate | White | Female | > $75,000 | Not Married | Rent | Not in Labor Force | fpwskwrf | MSA, Principle City | 3.0 | 0.0 | NaN | NaN |
| 26702 | 26702 | 2.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 3.0 | 1.0 | 1.0 | 5.0 | 2.0 | 2.0 | 65+ Years | Some College | White | Female | <= $75,000, Above Poverty | Not Married | Own | Not in Labor Force | qufhixun | Non-MSA | 0.0 | 0.0 | NaN | NaN |
| 26703 | 26703 | 1.0 | 2.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 4.0 | 2.0 | 2.0 | 5.0 | 1.0 | 1.0 | 18 - 34 Years | College Graduate | White | Male | <= $75,000, Above Poverty | Not Married | Rent | Employed | lzgpxyit | MSA, Principle City | 1.0 | 0.0 | fcxhlnwr | cmhcxjea |
| 26704 | 26704 | 2.0 | 2.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 4.0 | 4.0 | 2.0 | 5.0 | 4.0 | 2.0 | 55 - 64 Years | Some College | White | Female | NaN | Not Married | Own | NaN | lzgpxyit | MSA, Not Principle City | 0.0 | 0.0 | NaN | NaN |
| 26705 | 26705 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 1.0 | 2.0 | 2.0 | 1.0 | 2.0 | 18 - 34 Years | Some College | Hispanic | Female | <= $75,000, Above Poverty | Married | Rent | Employed | lrircsnp | Non-MSA | 1.0 | 0.0 | fcxhlnwr | haliazsg |
| 26706 | 26706 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 | 1.0 | 1.0 | 5.0 | 1.0 | 1.0 | 65+ Years | Some College | White | Male | <= $75,000, Above Poverty | Married | Own | Not in Labor Force | mlyzmhmf | MSA, Principle City | 1.0 | 0.0 | NaN | NaN |